- This is only a rough draft - Megan 04/10/92
-
- Summary of IETF BOF on Network Statistics and Analysis
-
-
- 1. Introduction
-
- The purpose of this BOF was to stimulate discussion
- and information exchange within the community concerning
- research in wide-area network traffic measurements.
- Five brief presentations of related research were made,
- followed by discussion of each.
-
- One theme of the BOF was to discuss exactly what kind
- of network instrumentation, measurement facilities, and
- types of measurements should be recommended to the Internet
- community. Many of us would like to encourage the managers
- of stub networks and routers to collect and make available
- information similar in spirit to the statistics
- that NSFNET makes available through Merit/NSFNET Information
- Services (NIS.NSF.NET). We hope this effort evolves into an RFC
- and eventually leads to widespread cooperation. We freely admit
- that the road to success will be iterative and fraught with
- challenging technical details.
-
- The amount of space consumed by this data completely depends on the
- type of measurement. For example, collecting TCP SYN/FIN/RST packets
- could lead to hundreds of megabytes a day, depending on the collection
- site. Other methods, like sampling or recording the number of bytes
- sent to particular destination networks, might require less than a hundred
- kilobytes a month. In the first case, the volume of trace data can be
- on the order of one to two percent of the traffic itself, so the resulting
- data may have to be shipped on tape rather than sent electronically to
- the site where the network analysis takes place.
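-
- As a rough illustration of the arithmetic behind such estimates, the
- sketch below computes a daily SYN/FIN/RST trace volume. The function
- names and every input figure (connections per day, control packets per
- connection, bytes per stored record) are hypothetical values chosen for
- illustration, not measurements reported at the BOF.
-
```python
def daily_trace_bytes(connections_per_day, packets_per_connection=4,
                      bytes_per_record=56):
    """Estimate bytes of SYN/FIN/RST trace data collected per day.

    packets_per_connection: control packets logged per TCP connection
        (assumed: two SYNs and two FINs for a cleanly closed connection).
    bytes_per_record: stored size of one record (assumed: a 40-byte
        TCP/IP header without options plus 16 bytes of timestamp and
        length bookkeeping).
    """
    return connections_per_day * packets_per_connection * bytes_per_record


def trace_fraction(trace_bytes, traffic_bytes):
    """Return the trace volume as a fraction of the traffic it describes."""
    return trace_bytes / traffic_bytes


if __name__ == "__main__":
    # Hypothetical busy site: one million TCP connections per day.
    volume = daily_trace_bytes(1_000_000)
    print(f"{volume / 1e6:.0f} MB/day")  # prints "224 MB/day"
```
-
- Under these assumed inputs the result falls in the "hundreds of
- megabytes a day" range; a quieter site or a more compact record
- format would shrink it proportionally.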
-
- The Internet Activities Board (IAB) recently announced guidelines for
- measurement activities. RFC 1262 lists bounds that should be commonly
- acceptable. However, RFC 1262 directly addresses invasive
- measurement activities, and is only marginally applicable
- to passive data collection. We believe we will have to face
- many issues hitherto unaddressed. What we propose must honor
- the concerns and restrictions that individual networks may impose,
- yet be thorough enough to capture the data that we need to accomplish
- the research goals, and should allow for flexibility. One example
- of a difficult issue is privacy when recording network
- addresses, particularly since workstations with their own IP addresses
- frequently map to individual users. Our efforts should include
- privacy measures that still allow professional research to be
- conducted.
-
- Most likely, each of us has a different idea as to the data we need
- to have measured to achieve our various objectives. Below,
- we summarize these motivations and give a preliminary list of the
- measurements and trace data that we believe should be collected or capturable.
- We encourage you all to add to both the motivation list and chart
- of traces and measurements, and mail them back to wanchar@usc.edu
- for inclusion in this document.
-
-
- 2. Motivations
-
- 2.1 Artificial workload models (Danzig and Jamin)
-
- Good artificial workload models are needed to drive simulations of
- new resource management algorithms, flow control algorithms, and
- routing algorithms. The artificial workload models that we are
- developing consist of an application-specific model (ftp, telnet, nntp, etc.)
- and an application arrival rate model that is stub-network dependent.
- So far we have been able to identify applications from their port
- numbers. As new transport protocols emerge, we may need other mechanisms.
- Creating the application-specific model requires full traces of TCP/IP
- packet headers. Creating the stub-network-specific model requires
- traces of TCP SYN/FIN/RST packets only. Most of our data has been collected
- with statspy or tcpdump from a machine on the same
- Ethernet segment as the stub network's gateway to the backbone.
- We would like to collect SYN/FIN/RST traces from hundreds of stub
- networks. Given current network bandwidth and usage, these traces
- can run as large as 200 MB/day.
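-
- The two steps described above, keeping only SYN/FIN/RST segments and
- identifying the application from its port number, might be sketched as
- follows. This is a minimal illustration, not the statspy or tcpdump
- implementation: the function names and the port table are our own
- assumptions, and the input is assumed to be a raw IPv4 packet with the
- link-layer header already removed.
-
```python
import struct

# Well-known TCP ports for the applications named above
# (ftp control and data, telnet, nntp). Table contents are illustrative.
PORT_TO_APP = {20: "ftp-data", 21: "ftp", 23: "telnet", 119: "nntp"}

# TCP flag bits in the flags byte of the TCP header.
FIN, SYN, RST = 0x01, 0x02, 0x04


def parse_tcp(packet):
    """Return (src_port, dst_port, flags) for an IPv4 TCP packet, else None."""
    if len(packet) < 20 or packet[0] >> 4 != 4 or packet[9] != 6:
        return None  # too short, not IPv4, or protocol is not TCP
    ihl = (packet[0] & 0x0F) * 4  # IP header length in bytes
    if len(packet) < ihl + 14:
        return None  # TCP header truncated before the flags byte
    src_port, dst_port = struct.unpack("!HH", packet[ihl:ihl + 4])
    flags = packet[ihl + 13]
    return src_port, dst_port, flags


def is_connection_event(flags):
    """True if the segment carries SYN, FIN, or RST."""
    return bool(flags & (SYN | FIN | RST))


def classify(src_port, dst_port):
    """Guess the application from whichever port is well known."""
    return PORT_TO_APP.get(dst_port) or PORT_TO_APP.get(src_port) or "other"
```
-
- A collector built this way would discard every segment for which
- is_connection_event() is false, which is what keeps SYN/FIN/RST traces
- far smaller than full header traces.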
-
-
- 2.2 Network planning (Braun and Claffy)
-
- SDSC and UCSD are undertaking a network analysis effort
- with multiple goals of immediate applicability
- and interest to the Internet environment, with respect
- to both performance and ubiquity.
-
- Areas of current investigation include: measurements and
- analysis of resource consumption and latencies, network
- performance degradation under resource starvation, and
- end-to-end performance testing. We have determined, for
- selected data sets, characteristics of network usage by
- application, bandwidth requirements, and geographic distribution.
- We are also exploring the role that granularity plays in traffic
- analysis, both in statistical sampling of traffic on an
- operational basis, and in the level of detail at which
- one presents data to optimize the information-to-noise ratio.
-
- We are currently analyzing data from a variety of
- sources, including national networks as well as federal network
- interconnection points of multiple agencies.
- Statistical examination and manipulation of data reveals
- significant traffic correlations, trends, and dependencies.
-
- We are also undertaking collaborative efforts with Toshiya
- Asaba and the WIDE statistics working group in Japan.
- In particular, Asaba is largely responsible for the
- analysis scripts that facilitated statistical
- examination and data presentation. We first intended
- the scripts for use in a study of international traffic
- between Japan and other nations, and were later able to adapt
- them for use in subsequent studies. Building a
- public library of usable scripts for different analysis
- tasks requires agreement on data formats in multiple
- phases of collection and analysis. We would like to
- see a collaborative effort within the community toward
- accomplishing such a task.
-
- Further information and slides are available by
- sending requests to the SDSC Applied Network Research
- Group, via hwb@sdsc.edu or kc@sdsc.edu.
-
- 2.3 Stateful router studies (Estrin and Mitzel)
- [Related information, though the authors did not take part in the BOF.]
-
- The current Internet is based on a stateless (datagram) architecture.
- However, many recent proposals rely on the maintenance of state
- information within network routers, leading to our interest in the
- implications of a ``stateful'' network layer.
- We wish to collect internetwork traffic traces at the border routers
- of stub and transit networks, and use this data to evaluate, or
- predict, the effects of design alternatives for stateful architectures.
-
- An important design decision is the level at which conversations are
- defined. This determines the granularity of control over the network
- traffic, and affects the scalability of the system. We are interested in
- several granularities of conversations, ranging from
- a single TCP application association, up to aggregation of all traffic
- between two communicating networks. We will use the data to estimate the
- number of active conversations at a router, and derive the
- storage requirements for the associated conversation state table. We
- will analyze the feasibility of fine grain control at the network
- periphery and deeper within the network.
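-
- One way to estimate the number of active conversations from a trace
- is sketched below. It is an illustration under stated assumptions,
- not the analysis Estrin and Mitzel performed: the record layout, the
- function names, the 60-second idle timeout, and the use of classful
- network numbers for the coarsest granularity are all our own choices.
-
```python
def classful_network(ip):
    """Return the classful network prefix of a dotted-quad IPv4 address."""
    octets = ip.split(".")
    first = int(octets[0])
    keep = 1 if first < 128 else 2 if first < 192 else 3  # class A, B, C
    return ".".join(octets[:keep])


def conversation_key(record, granularity):
    """Map a packet record to a direction-independent conversation key.

    record: (timestamp, src_ip, src_port, dst_ip, dst_port)
    """
    _, src_ip, src_port, dst_ip, dst_port = record
    if granularity == "application":
        ends = ((src_ip, src_port), (dst_ip, dst_port))
    elif granularity == "host":
        ends = (src_ip, dst_ip)
    elif granularity == "network":
        ends = (classful_network(src_ip), classful_network(dst_ip))
    else:
        raise ValueError(f"unknown granularity: {granularity}")
    return tuple(sorted(ends))  # A->B and B->A share one conversation


def peak_active_conversations(trace, granularity, timeout=60.0):
    """Return the largest number of conversations active at once.

    A conversation stays active until `timeout` seconds pass without a
    packet. `trace` must be sorted by timestamp.
    """
    last_seen = {}
    peak = 0
    for record in trace:
        now = record[0]
        last_seen[conversation_key(record, granularity)] = now
        # Expire idle entries, as a router state table would.
        for key in [k for k, t in last_seen.items() if now - t > timeout]:
            del last_seen[key]
        peak = max(peak, len(last_seen))
    return peak
```
-
- Multiplying the peak by an assumed per-entry size gives a first
- estimate of the storage needed for the conversation state table at
- each granularity.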
-
- In conventional IP, the only lookup function normally required for
- packet forwarding is a routing table lookup. This has been recognized
- as a bottleneck in the forwarding process [Feldmeier, Jain].
- It has been shown that the introduction of an LRU cache can substantially
- improve the efficiency of the packet forwarding process. Route
- caching is used in many existing routers. However, unlike the
- stateful schemes investigated here, which require lookup based
- on source--destination pairs, current route caches are based only
- on destination host or network. It is not intuitively obvious whether
- the solutions developed for routing table caches can be applied here.
- We will use our network traffic traces to
- perform trace driven simulations of an LRU cache, for different
- conversation granularities, and thereby assess traffic locality and
- the benefits of caching.
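-
- A trace-driven LRU simulation of this kind can be sketched in a few
- lines. This is a minimal illustration rather than the simulator used
- in the study; the function name and the tiny packet stream are
- hypothetical. The lookup key is whatever the chosen granularity
- dictates, for example a destination alone or a source-destination pair.
-
```python
from collections import OrderedDict


def lru_hit_ratio(keys, cache_size):
    """Replay `keys` through an LRU cache and return the hit ratio."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for key in keys:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical packet stream of (source, destination) host pairs.
    packets = [("a", "x"), ("b", "x"), ("a", "x"), ("c", "y"), ("b", "x")]
    by_destination = [dst for _, dst in packets]
    print(lru_hit_ratio(by_destination, cache_size=2))  # prints 0.6
    print(lru_hit_ratio(packets, cache_size=2))         # prints 0.2
```
-
- In this toy stream the pair-keyed cache hits less often than the
- destination-keyed one at the same size, which is the kind of
- difference in locality the simulations are meant to quantify.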
-
-
- 2.4 Network monitoring (Schwartz and Pu)
-
- Schwartz proposed that a dozen or so of us agree to
- collaborate to collect traces and measurements. He also described
- his recent study of FTP traffic, which showed that tools to
- locate copies of large, replicated files may reduce wide-area
- network traffic due to FTP. The unique aspect of Schwartz's traces
- was that they actually examined application-level data in a
- way that preserved privacy.
-
-
- 2.5 Host reliability and availability (Long)
-
- Long summarized his study of internet host reliability and
- availability. This was the only active form of tracing
- discussed during the BOF.
-
-
- 3. Measurements and traces
-
- Here is a first pass at the type of data we would like to see
- collected, and what studies would use this data. These categories
- need to be detailed, and new categories probably need to be
- filled in. The table identifies four types of data to collect.
- These include captured packets and packet headers (excluding
- data), headers of selected packets, summary data, and routing and
- congestion data. The first three types of data are fairly well
- defined, while the last is much less so. Although we can collect such
- data anywhere in the Internet, we group collection points into three
- classes: entrances to stub networks, regional and backbone
- routers, and international gateways.
-
                                      TYPE OF DATA

MEASUREMENT  | Captured       | TCPDUMP      | NIS.NSF.NET- | Router       |
POINT        | packets and    | conversation | like data    | timing and   |
             | packet headers | SYN/FIN/RST  |              | queue length |
             |                | traces       |              | (MIB)        |
----------------------------------------------------------------------------
STUB         | Workload       | Workload     |              | Congestion   |
NETWORKS     | models         | models       |              | studies      |
             |                |              |              |              |
             | Workload       |              | Workload     |              |
             | planning       |              | planning     |              |
----------------------------------------------------------------------------
REGIONAL     | Stateful       |              |              | Congestion   |
AND BACKBONE | routers        |              |              | studies      |
NETWORKS     |                |              |              |              |
             | Workload       |              | Workload     |              |
             | planning       |              | planning     |              |
----------------------------------------------------------------------------
INTER-       |                |              |              | Congestion   |
NATIONAL     |                |              |              | studies      |
GATEWAYS     |                |              |              |              |
             | Workload       |              | Workload     |              |
             | planning       |              | planning     |              |
----------------------------------------------------------------------------
- Table 1.
-
-
- 4. Trace formats and tools
-
- We need to define the storage format for trace and statistical data.
- For some tools, like tcpdump or statspy, the format is already defined.
- Almost certainly we should adopt NSFNET's current format for the type of data
- they collect. We also need to define ``sanitizer'' programs that address the
- security and privacy concerns of particular networks.
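-
- As one possible shape for a sanitizer, the sketch below keeps the
- network part of each address and replaces the host part with bytes
- from a keyed hash. Everything here is an assumption made for
- illustration: the function name, the choice of HMAC-SHA256, and the
- default of keeping two octets are ours, and no particular network's
- policy is implied.
-
```python
import hashlib
import hmac


def sanitize_address(ip, key, keep_octets=2):
    """Pseudonymize the host part of a dotted-quad IPv4 address.

    The first `keep_octets` octets (assumed to be the network part) are
    kept so per-network statistics survive. The remaining octets are
    replaced by bytes from a keyed hash of the full address, so a given
    host always maps to the same pseudonym under a given key.
    """
    octets = ip.split(".")
    if len(octets) != 4 or not 0 <= keep_octets <= 4:
        raise ValueError("expected a dotted-quad address and 0..4 octets")
    digest = hmac.new(key, ip.encode("ascii"), hashlib.sha256).digest()
    hidden = [str(b) for b in digest[:4 - keep_octets]]
    return ".".join(octets[:keep_octets] + hidden)
```
-
- Because only a few octets are replaced, two hosts on one network can
- occasionally receive the same pseudonym, and anyone holding the key can
- recompute the mapping. A site with stricter requirements would need a
- stronger scheme.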
-
- There is an operations area in IETF which has been defining some standard
- transport and storage formats for various kinds of operational data.
-
- Dealing with gigabytes of data has a serious resource impact.
- An effort has to be undertaken to identify schemes to make such large
- quantities of data useful, possibly via multiple levels of data reduction.
-
-
- 5. Mailing list
-
- The current composition of wanchar@usc.edu is listed below.
- Change requests can be sent to wanchar-request@usc.edu
-
-
- afs@germany.eu.net
- ala@merit.edu
- amr@nri.reston.va.us
- asaba@isr.recruit.co.jp
- bac@sdsc.edu
- becker@ans.net
- boss@sunet.se
- brunner@practic.com
- calton@cs.columbia.edu
- carson@utcs.utoronto.ca
- cbagwell@gateway.mitre.org
- chris@wugate.wustl.edu
- cjw@nersc.gov
- cward@westnet.net
- dan@merlin.dev.cdx.mot.com
- danzig@usc.edu
- darrell@cse.ucsc.edu
- estrin@usc.edu
- fair@apple.com
- golding@cis.ucsc.edu
- goodwin@psc.edu
- gruth@bbn.com
- henry@oar.net
- hwb@sdsc.edu
- jamin@usc.edu
- jfl@nersc.gov
- jgodsil@ncsa.uiuc.edu
- jkay@cs.ucsd.edu
- jonchy@dxcoms.cern.ch
- jrc@uswest.com
- jun@wide.ad.jp
- kc@sdsc.edu
- kfall@cs.ucsd.edu
- korz@bach.cs.columbia.edu
- kr@concord.com
- lear@sgi.com
- lindahl@violet.berkeley.edu
- lwinkler@anl.gov
- mak@cnd.hp.com
- mak@merit.edu
- mankin@gateway.mitre.org
- martin@cearn.cern.ch
- medin@nsipo.nasa.gov
- morris@ucar.edu
- mws@sparta.com
- nevil@aukuni.ac.nz
- nitzan@ws1013.nersc.gov
- ogud@cs.umd.edu
- peter@usc.edu
- peterd@cc.mcgill.ca
- polyzos@cs.ucsd.edu
- probins@bubba.wpd.sgi.com
- pushp@cerf.net
- rama@erlang.enet.dec.com
- rbutler@ncsa.uiuc.edu
- rcollet@icm1.icp.net
- reschly@brl.mil
- rgc@qsun.att.com
- rin@qsun.att.com
- rj@sgi.com
- schwartz@cs.colorado.edu
- sherk@sura.net
- stats@nic.near.net
- suelin@ibm.com
- tmwalden@saturn.sys.acc.com
- tom@cic.net
- topolcic@nri.reston.va.us
- van@horse.ee.lbl.gov
- vcerf@nri.reston.va.us
- vern@horse.ee.lbl.gov
- vikas@jvnc.net
- vu@polaris.dca.mil
- whaley@ncsc.org
-